From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning

Valadi, Viktor, Åkesson, Mattias, Östman, Johan, Toor, Salman, Hellander, Andreas

arXiv.org Artificial Intelligence

Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference-mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.
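To see why shared gradients can leak inputs at all, consider the textbook case of a single fully connected layer, where the input is recoverable analytically from the weight and bias gradients. This is a minimal sketch of that classic observation, not the paper's attacks:

```python
import numpy as np

rng = np.random.default_rng(0)

# A single client sample passing through one fully connected layer.
x = rng.normal(size=4)          # private input
W = rng.normal(size=(3, 4))     # layer weights
b = np.zeros(3)

# Forward pass and a simple squared-error loss against a random target.
y = W @ x + b
target = rng.normal(size=3)
dL_dy = 2 * (y - target)        # gradient of the loss w.r.t. the layer output

# Gradients the client would share in federated learning.
grad_W = np.outer(dL_dy, x)     # dL/dW = dL/dy * x^T
grad_b = dL_dy                  # dL/db = dL/dy

# Analytic inversion: each row of grad_W is a scaled copy of x,
# and grad_b supplies the scale, so x = grad_W[i] / grad_b[i].
x_recovered = grad_W[0] / grad_b[0]
print(np.allclose(x_recovered, x))  # True
```

Deeper networks, larger batches, and training-mode noise (dropout, batch statistics) break this clean analytic recovery, which is exactly the gap between inference-mode and training-mode vulnerability the paper studies.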


X-VFL: A New Vertical Federated Learning Framework with Cross Completion and Decision Subspace Alignment

Yao, Qinghua, Xu, Xiangrui, Li, Zhize

arXiv.org Artificial Intelligence

Vertical Federated Learning (VFL) enables collaborative learning by integrating disjoint feature subsets from multiple clients/parties. However, VFL typically faces two key challenges: i) the requirement for perfectly aligned data samples across all clients (missing features are not allowed); ii) the requirement for joint collaborative inference/prediction involving all clients (it does not support locally independent inference on a single client). To address these challenges, we propose X-VFL, a new VFL framework designed to deal with the non-aligned data samples with (partially) missing features and to support locally independent inference of new data samples for each client. In particular, we design two novel modules in X-VFL: Cross Completion (XCom) and Decision Subspace Alignment (DS-Align). XCom can complete/reconstruct missing features for non-aligned data samples by leveraging information from other clients. DS-Align aligns local features with completed and global features across all clients within the decision subspace, thus enabling locally independent inference at each client. Moreover, we provide convergence theorems for different algorithms used in training X-VFL, showing an $O(1/\sqrt{T})$ convergence rate for SGD-type algorithms and an $O(1/T)$ rate for PAGE-type algorithms, where $T$ denotes the number of training update steps. Extensive experiments on real-world datasets demonstrate that X-VFL significantly outperforms existing methods, e.g., achieving a 15% improvement in accuracy on the image CIFAR-10 dataset and a 43% improvement on the medical MIMIC-III dataset. These results validate the practical effectiveness and superiority of X-VFL, particularly in scenarios involving partially missing features and locally independent inference.
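The abstract does not spell out how XCom reconstructs missing features, so the following is only a rough illustration of the cross-completion idea under a hypothetical linear simplification (the least-squares map and all variable names are invented, not the paper's method): one party fits a map from its own features to another party's features on the aligned samples, then uses it to complete the non-aligned ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two parties hold disjoint feature subsets; rows are shared sample IDs.
# True underlying relation (unknown to the parties): feats_b ~ feats_a @ M.
M = rng.normal(size=(3, 2))
feats_a = rng.normal(size=(100, 3))                        # party A's features
feats_b = feats_a @ M + 0.01 * rng.normal(size=(100, 2))   # party B's features

# Only the first 80 samples are aligned; B's features are missing afterwards.
aligned = slice(0, 80)
missing = slice(80, 100)

# "Cross completion" sketch: fit a least-squares map from A's features to
# B's on the aligned rows, then reconstruct B's missing rows.
M_hat, *_ = np.linalg.lstsq(feats_a[aligned], feats_b[aligned], rcond=None)
feats_b_hat = feats_a[missing] @ M_hat

err = np.abs(feats_b_hat - feats_b[missing]).mean()
print(err < 0.05)  # True: reconstruction error is on the order of the noise
```

X-VFL's actual XCom module is learned jointly with the model rather than being a closed-form regression, but the sketch shows why aligned samples are the resource that makes completion of non-aligned ones possible.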


Enhancing Predictive Maintenance in Mining Mobile Machinery through a TinyML-enabled Hierarchical Inference Network

de la Fuente, Raúl, Radrigan, Luciano, Morales, Anibal S

arXiv.org Artificial Intelligence

Mining machinery operating in variable environments faces high wear and unpredictable stress, challenging Predictive Maintenance (PdM). This paper introduces the Edge Sensor Network for Predictive Maintenance (ESN-PdM), a hierarchical inference framework across edge devices, gateways, and cloud services for real-time condition monitoring. The system dynamically adjusts inference locations--on-device, on-gateway, or on-cloud--based on trade-offs among accuracy, latency, and battery life, leveraging Tiny Machine Learning (TinyML) techniques for model optimization on resource-constrained devices. Performance evaluations showed that on-sensor and on-gateway inference modes achieved over 90% classification accuracy, while cloud-based inference reached 99%. On-sensor inference reduced power consumption by approximately 44%, enabling up to 104 hours of operation. Latency was lowest for on-device inference (3.33 ms), increasing when offloading to the gateway (146.67 ms) or cloud (641.71 ms). The ESN-PdM framework provides a scalable, adaptive solution for reliable anomaly detection and PdM, crucial for maintaining machinery uptime in remote environments. By balancing accuracy, latency, and energy consumption, this approach advances PdM frameworks for industrial applications.


ProFuser: Progressive Fusion of Large Language Models

Shi, Tianyuan, Wan, Fanqi, Huang, Canbin, Quan, Xiaojun, Li, Chenliang, Yan, Ming, Zhang, Ji

arXiv.org Artificial Intelligence

While fusing the capacities and advantages of various large language models (LLMs) offers a pathway to construct more powerful and versatile models, a fundamental challenge is to properly select advantageous models during training. Existing fusion methods primarily focus on the training mode, which uses cross entropy on ground truth in a teacher-forcing setup to measure a model's advantage, and which may provide limited insight into model advantage. In this paper, we introduce a novel approach that enhances the fusion process by incorporating both the training and inference modes. Our method evaluates model advantage not only through cross entropy during training but also by considering inference outputs, providing a more comprehensive assessment. To combine the two modes effectively, we introduce ProFuser to progressively transition from inference mode to training mode. To validate ProFuser's effectiveness, we fused three models, including vicuna-7b-v1.5, Llama-2-7b-chat, and mpt-7b-8k-chat, and demonstrated improved performance in knowledge, reasoning, and safety compared to baseline methods.
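ProFuser's exact schedule is not given in the abstract; the following is a minimal sketch of the general idea, assuming a linear weight shift between an inference-mode score and a training-mode cross-entropy signal (the function names, the linear schedule, and the sign convention are all illustrative):

```python
def fusion_weight(step: int, total_steps: int) -> float:
    """Linearly shift emphasis from inference mode (1.0) to training mode (0.0)."""
    return max(0.0, 1.0 - step / total_steps)

def model_advantage(ce_loss: float, inference_score: float,
                    step: int, total_steps: int) -> float:
    """Blend the two advantage signals; lower CE loss means higher advantage."""
    w = fusion_weight(step, total_steps)
    return w * inference_score + (1.0 - w) * (-ce_loss)

# Early on, the inference-mode score dominates; later, training-mode CE does.
early = model_advantage(ce_loss=2.0, inference_score=0.8, step=0, total_steps=100)
late = model_advantage(ce_loss=2.0, inference_score=0.8, step=100, total_steps=100)
print(early)  # 0.8
print(late)   # -2.0
```

The point of the progressive transition is that inference outputs are expensive but informative early on, while teacher-forced cross entropy becomes a more reliable signal as the fused model converges.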


Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers

Scherer, Moritz, Macan, Luka, Jung, Victor, Wiese, Philip, Bompani, Luca, Burrello, Alessio, Conti, Francesco, Benini, Luca

arXiv.org Artificial Intelligence

The latest evolutions in mainstream Artificial Intelligence (AI) have been driven by Transformers, which have taken over from Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) as the leading edge models for language processing and multi-modal applications [1], [2]. The success of Transformers can be primarily attributed to the emergence of the Foundation Model (FM) paradigm: large Transformer models extensively pre-trained on datasets spanning trillions of tokens and then fine-tuned with a much lower volume of labeled data to solve domain-specific problems. Following the success of FMs in Natural Language Processing (NLP) [1], [3], an increasing number of fields [...]. Despite many recent successes with previous-generation Deep Neural Networks (DNNs), the emergence of the tinyML paradigm for EFMs faces the dual challenge of reducing FMs to a manageable size and enabling their deployment on tiny devices. A first concrete step in this direction is the recent introduction of Small Language Models (SLMs): FMs with tens to a few hundred million, rather than several billion, parameters [8], [9]. While most currently available FMs are focused on processing natural language at a proof-of-concept scale, the effort towards embedded multi-modal sensor inputs with small-scale, application-specific FMs offers a highly promising path for the development of this novel class of models.


Sparsifying Spiking Networks through Local Rhythms

Olin-Ammentorp, Wilkie

arXiv.org Artificial Intelligence

It has been well-established that within conventional neural networks, many of the values produced at each layer are zero. In this work, I demonstrate that spiking neural networks can prevent the transmission of spikes representing values close to zero using local information. This can reduce the amount of energy required for communication and computation in these networks while preserving accuracy. Additionally, this demonstrates a novel application of biologically observed spiking rhythms.
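The paper's mechanism builds on biologically observed spiking rhythms; the following is only a simplified sketch of the underlying idea that each neuron can suppress near-zero transmissions using purely local information (the threshold gating here is illustrative, not the paper's rhythm-based scheme):

```python
import numpy as np

def gate_spikes(values: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Suppress transmissions whose encoded value is close to zero.

    Each neuron inspects only its own activation to decide whether to
    emit a spike, so no global coordination is needed.
    """
    mask = np.abs(values) >= threshold
    return np.where(mask, values, 0.0)

activations = np.array([0.02, -0.5, 0.09, 1.3, -0.04, 0.2])
gated = gate_spikes(activations)

# Half of the activations fall below the threshold and are never transmitted,
# saving communication and computation while the large values survive.
kept_fraction = np.count_nonzero(gated) / activations.size
print(kept_fraction)  # 0.5
```

As in the paper's experiments, the accuracy cost of such gating depends on how much information the suppressed near-zero values actually carried.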


Using Whisper (speech-to-text) and Tortoise (text-to-speech)

#artificialintelligence

I’ll demonstrate how to extract an audio clip from YouTube, implement speech recognition using OpenAI’s Whisper, and perform speech generation using Tortoise to clone a custom voice.


Clothes and color extraction with Generative Adversarial Network

#artificialintelligence

So how can we solve this problem? I tried to replace the background of the original image with a solid color manually and realized that with this kind of input the model produces much better results. But how can this job be automated?


Making AI FaaSt

#artificialintelligence

Drascalita Haut: Today we're going to talk about functions, and in particular Functions as a Service, and apply it to AI in order to present a solution that seems to bring strategic advantages when deploying AI services at scale. During this session, it may feel like we're dancing a bit, moving through tools and new technologies; you might even see some new steps, like workflows or methods to work with AI. And for those of you that know salsa, you know that it starts with a step forward. So today, I'm going to start with some bold statements, but bear with us: I'm going to take a step back, and then me and AK are going to rehearse something through a live demo, which hopefully is going to go just fine, to illustrate what we're talking about. Let me start with a step forward: the FaaS value prop. What does FaaS bring that more and more people are talking about? I came up with three reasons. Number one is FaaSter to prototype, FaaSter to create services, because we work with code, with functions, just code, and we just push the code as it is. Second, never pay for idle. FaaS platforms have the capability to shut down the parts of the system that are not used, so we don't incur any cost. And the third one is a low maintenance overhead. That's because FaaS platforms usually take away the burden of creating containers, keeping them up to date, applying security updates, auto-scaling the functions, and deploying them in multiple regions. In other words, FaaS boldly claims that you will find it easier to build more services, and you're going to pay less. Now, this is a pretty bold statement, isn't it? So allow me to take a step back and look at how developers are producing microservices today. A few years ago, we realized that microservices are better than monoliths because, in essence, they add flexibility and they simplify the experience. At the same time, it's also less risky to independently update parts of the system. And I would assume that many of us know what microservices are.
A very high-level microservice architecture is in this slide. So the final solution basically consists of isolated pieces, each with its own independent deployment lifecycle. Now, microservices used to be deployed in their own VMs, and then containers came, and it was such a revolution because we were able to concurrently run multiple services in isolation in the same VM.